Skip to content

winch: respect the enable_nan_canonicalization setting#12939

Merged
saulecabrera merged 9 commits intobytecodealliance:mainfrom
r-near:winch-nan-canonicalization
Apr 7, 2026
Merged

winch: respect the enable_nan_canonicalization setting#12939
saulecabrera merged 9 commits intobytecodealliance:mainfrom
r-near:winch-nan-canonicalization

Conversation

@r-near
Copy link
Copy Markdown
Contributor

@r-near r-near commented Apr 2, 2026

The enable_nan_canonicalization flag already flows through to Winch via the shared Flags, but Winch was ignoring it. This adds maybe_canonicalize_nan (scalar) and maybe_canonicalize_v128_nan (SIMD) methods to the Masm trait that, when the flag is set, emit a NaN-detecting sequence to replace results with the canonical quiet NaN.

  • Scalar: compare-with-self + conditional branch to load canonical NaN. Implemented for x64 and aarch64
  • SIMD (x64): unordered compare to build a NaN lane mask, then bitwise select with canonical NaN from the constant pool

Covers add, sub, mul, div, min, max, sqrt, ceil, floor, trunc, nearest, demote, and promote for both scalar and SIMD. Canonical NaN constants are shared across ISAs. Removes the canonicalize-nan.wast Winch skip.

Copilot AI review requested due to automatic review settings April 2, 2026 04:50
@r-near r-near requested review from a team as code owners April 2, 2026 04:50
@r-near r-near requested review from alexcrichton and removed request for a team April 2, 2026 04:50
@r-near r-near marked this pull request as draft April 2, 2026 04:52
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR ensures Winch respects the existing enable_nan_canonicalization shared setting by adding a canonicalize_nan hook to the MacroAssembler trait and invoking it after scalar floating-point operations so NaN results are normalized to the canonical quiet-NaN bit pattern.

Changes:

  • Add MacroAssembler::canonicalize_nan and implement it for x64 and aarch64.
  • Invoke NaN canonicalization after scalar float arithmetic/conversion ops (and after rounding paths in the visitor).
  • Add a scalar WAST test to validate canonical NaN bit patterns.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
winch/codegen/src/visitor.rs Inserts canonicalize_nan after scalar float ops and adds helper for rounding results.
winch/codegen/src/masm.rs Extends the MacroAssembler trait with canonicalize_nan.
winch/codegen/src/isa/x64/masm.rs Implements NaN detection + replacement with canonical NaN on x64.
winch/codegen/src/isa/aarch64/masm.rs Implements NaN detection + replacement with canonical NaN on aarch64; stores shared flags in the masm.
tests/misc_testsuite/canonicalize-nan-scalar.wast Adds scalar coverage for canonical NaN bit patterns and propagation.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@r-near r-near force-pushed the winch-nan-canonicalization branch from fc5efb7 to 97d89bb Compare April 2, 2026 04:58
@r-near r-near force-pushed the winch-nan-canonicalization branch from 97d89bb to 0c5e168 Compare April 2, 2026 04:59
@r-near r-near marked this pull request as ready for review April 2, 2026 05:13
@github-actions github-actions bot added the winch Winch issues or pull requests label Apr 2, 2026
@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 2, 2026

Subscribe to Label Action

cc @saulecabrera

Details This issue or pull request has been labeled: "winch"

Thus the following users have been cc'd because of the following labels:

  • saulecabrera: winch

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

@r-near
Copy link
Copy Markdown
Contributor Author

r-near commented Apr 2, 2026

@cfallin Thanks for reviewing #12940! Would you mind taking a look at this one as well?

@cfallin
Copy link
Copy Markdown
Member

cfallin commented Apr 2, 2026

@r-near we usually either stick with the assignment bot's suggestions (the reason I reviewed your last PR) or, in cases such as this one, tag a sub-area reviewer who knows part of the code best -- in this case @saulecabrera might be best to review Winch-specific work?

@saulecabrera
Copy link
Copy Markdown
Member

I'll review.

@saulecabrera saulecabrera requested review from saulecabrera and removed request for a team and alexcrichton April 6, 2026 13:49
Copy link
Copy Markdown
Member

@saulecabrera saulecabrera left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, some comments inline.

Ok(())
}

fn canonicalize_nan(&mut self, reg: WritableReg, size: OperandSize) -> Result<()> {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that the canonicalization here is optional, let's rename this method to maybe_canonicalize_nan.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, could you expand a bit on the rationale for adding a per-Masm implementation instead of an ISA-agnostic one? A priori, it seems that the MacroAssember exposes enough functionality to express canonicalization one layer above.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The trait doesn't have float-and-compare branch or float-constant-load, so an agnostic version would need to go through float_cmp_with_set plus an integer branch, which is a few more instructions than the native path each backend can do.

I was leaning per-ISA, but perhaps we just add those primitives to the trait?

BTW I added the shared constants so that should at least cover the duplication concern.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed the method here: 7cdd58e (this PR)

Comment on lines +700 to +701
OperandSize::S32 => &0x7FC00000u32.to_le_bytes(),
OperandSize::S64 => &0x7FF8000000000000u64.to_le_bytes(),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid repeating these immediate values across ISAs, can we introduce a constant at the Masm layer and import them here instead?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, here: 75cecea (this PR)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we also implement this functionality for the non-scalar counterparts? I believe that all vector instructions are implemented in order to respect canonicalization. Once implemented we should be able to fully test simd/canonicalize-nan.wast in x86-64, by removing this line https://github.com/bytecodealliance/wasmtime/blob/main/crates/test-util/src/wast.rs#L614

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok here we go: cdef987 (this PR).

Added maybe_canonicalize_v128_nan to the masm trait and implemented it for x64

I removed the skip and it looks like the tests pass 🎉

where
M: MacroAssembler,
{
fn canonicalize_nan_for_round(&mut self, size: OperandSize) -> Result<()> {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we replace this function with a a call to self.masm.canonicalize_nan passing in the return register of the builtin function instead? The register holding the return value should be accessible via the builtins struct. That way we can avoid a function call (to canonicalize_nan_for_roundand going through all the hoops involved in pop_to_reg) and more importantly we reduce the risk of this function getting used in context different to round-result canonicalization, for which as far as I can tell, we are not enforcing any particular invariants.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ended up removing the function here: 1722807 (this PR).

I kept the pop/canonicalize/push inline since pop_to_reg is essentially a no-op when the value is already in a register... idk, it looked like wiring up the return register through float_round felt more invasive, but that might be a more purist solution

Copy link
Copy Markdown
Member

@saulecabrera saulecabrera left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thanks for iterating on this. One last request and then we can land this one.

Ok(())
}

fn maybe_canonicalize_nan(&mut self, reg: WritableReg, size: OperandSize) -> Result<()> {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it would be worth expressing the scalar version as the vector counterpart (using a single lane), to make it branchless. We'd probably need to fallback to the branch version when avx instructions are not supported though. For now let's leave a comment here stating that in the future if a more performant version is needed, a variation of the v128 branchless could be considered for scalars.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@saulecabrera saulecabrera added this pull request to the merge queue Apr 7, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Apr 7, 2026
@r-near
Copy link
Copy Markdown
Contributor Author

r-near commented Apr 7, 2026

@saulecabrera could you re-add it to the merge queue? I fixed the test failure

@saulecabrera
Copy link
Copy Markdown
Member

That is the right fix, yeah.

@saulecabrera saulecabrera added this pull request to the merge queue Apr 7, 2026
Merged via the queue into bytecodealliance:main with commit 9f93e49 Apr 7, 2026
48 checks passed
alexcrichton added a commit to alexcrichton/wasmtime that referenced this pull request Apr 9, 2026
Behavior of Winch changed in bytecodealliance#12939 in such a way that it's breaking
differential fuzz testing with wasmi. Now that Winch supports NaN
canonicalization this commit adjusts the fuzz test configuration
generation to additionally enable it for Winch in addition to Cranelift.
github-merge-queue bot pushed a commit that referenced this pull request Apr 9, 2026
Behavior of Winch changed in #12939 in such a way that it's breaking
differential fuzz testing with wasmi. Now that Winch supports NaN
canonicalization this commit adjusts the fuzz test configuration
generation to additionally enable it for Winch in addition to Cranelift.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

winch Winch issues or pull requests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants